Abstract

We investigate both the locus of ambiguity in the architecture 
of language and the origin of ambiguity in natural communication. 
We 1) locate ambiguity at the externalization branch of language, 2) 
provide a rigorous, general definition of ambiguity through the con- 
cept of logical irreversibility, quantifying the amount of ambiguity 
within the framework of Shannon's information theory, and 3) pro- 
vide a proof of concept that the constraints acting over a natural code 
force ambiguity to emerge. Accordingly, the emergence of ambiguous 
codes is an unavoidable byproduct of efficient communication. 
1 Introduction



It is a common observation that natural languages are ambiguous, namely, 
that linguistic utterances can potentially be assigned more than one in- 
terpretation and that receivers of linguistic utterances need to resort to 
supplementary information (i.e., the linguistic or the communicative con- 
text) to choose one among the available interpretations. 

Both linguists and logicians have been interested in this observation. On 
the one hand, a traditional task of grammar is to illustrate and classify
ambiguity, which may be of different types, as well as to determine how 
apparently ambiguous utterances are disambiguated at the relevant level of 
representation; indeed, the search for a parsimonious treatment of certain 
ambiguities such as scope ambiguities has been one of the most powerful 
motors in the development of the formal inquiry of the syntax-semantics 
interface, since its modern inception in Montague's semiotic program
(Montague 1974). It is no exaggeration at all, in our opinion, to say that the
presence of ambiguity (particularly, scope ambiguity) and the apparent mis- 
match between the form and the alleged semantic structure of quantified 
expressions in natural languages have been the two major guiding problems 
in the development of a rigorous theory of the syntax-semantics interface 
of natural languages. 

On the other hand, logicians in general would not be as interested in describ- 
ing or characterizing the phenomenon of ambiguity, as in the construction 
of unambiguous artificial languages whose primitive symbols have a univo- 
cal interpretation and whose formulae are constructed by the appropriate 
recursive syntactic definitions and unambiguously interpreted by the rele- 
vant compositional semantic rules, formulated as recursive definitions that 
trace back the syntactic construction of the formulae. Not surprisingly,
some philosophers, such as Wittgenstein (1922, 3.323-3.325), identified the
ambiguity of ordinary language as the source of philosophical confusion,
and aspired to construct a language whose signs were univocal and whose 
propositions mirrored the logical structure of reality itself. 

It is a rather common view that the presence of phenomena such as ambiguity
and garden path sentences suggests that language is poorly designed
for communication (Chomsky 2008). In fact, there are at least two opposite
starting hypotheses about the nature and emergence of ambiguity in natu- 
ral communication systems. It could well be that ambiguity is an intrinsic 
imperfection, since natural, self-organized codes of communication are not 
perfect, but coevolved through a fluctuating medium and no one designed
them. But it may just as well be that ambiguity is the result of an optimization
process, inasmuch as natural, self-organized codes of communication
must satisfy certain constraints that a non-ambiguous artificial language 
can afford to neglect. Logical languages, for instance, are constructed to 
study relations such as logical consequence and equivalence among well- 
formed formulae, for which it is desirable to define syntactic rules that do 
not generate syntactically ambiguous expressions. However, in the design
of logical languages that provide the appropriate tools for that particular 
purpose, certain features that may be crucial in the emergence of natural 
communication systems are neglected, such as the importance of the cost in
generating expressions and the role of cooperation (Wang 2008) between the
coder and the decoder in the process of communicating those expressions, 
a factor that is completely extraneous to the design of logical languages. 

Despite the indisputability of ambiguity in natural languages and the atten- 
tion that this observation has received among linguists, philosophers and 
logicians, it is fair to conclude that the emergence of ambiguity has not
yet come under serious theoretical scrutiny.

In this article we firstly provide a rather extensive review of ambiguity 
within grammar. We discuss several lexical, syntactic, phonological and 
semantic aspects involved in ambiguity and we argue for the idea that am- 
biguity appears at the externalization branch of language. In brief, it is 
shown that ambiguity appears because phonetic forms dispense with part 
of the information that is present in logical forms. Secondly, we provide 
a general mathematical definition of ambiguity through the computational 
concept of logical irreversibility and we rigorously quantify the ambiguity 
of a code in terms of the amount of uncertainty of the reversal of the coding 
process. We finally capitalize on the two above-mentioned factors (the im- 
portance of the cost in generating expressions and the role of cooperation 
between the coder and the decoder) in order to provide a rigorous argument 
for the idea that ambiguity is an unavoidable consequence of the following 
efficiency factor in natural communication systems: interacting commu- 
nicative agents must attain a code that tends to minimize the complexities 
of both the coding and the decoding processes. As a proof of concept, we 
rigorously explore a simple system based on two agents (coder and decoder)
under a symmetrical (cooperative) scenario, and we show that ambiguity
must emerge. 
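
The quantification announced above can be previewed with a small numerical sketch. It assumes, for illustration only, that the uncertainty in reversing a deterministic code is measured as the conditional entropy of the meaning given the signal, in bits; the formal definition is the one developed in Section 3. The function name `ambiguity_bits`, the toy codes and the uniform distribution are hypothetical choices made for this sketch.

```python
from collections import defaultdict
from math import log2


def ambiguity_bits(code, p_meaning):
    """Conditional entropy H(meaning | signal) of a deterministic code, in bits.

    code:      dict mapping each meaning to the signal that encodes it
    p_meaning: dict mapping each meaning to its probability (summing to 1)
    """
    # Probability of a signal = total probability of the meanings mapped to it.
    p_signal = defaultdict(float)
    for meaning, signal in code.items():
        p_signal[signal] += p_meaning[meaning]

    # For a deterministic code, p(meaning | signal) = p(meaning) / p(signal).
    h = 0.0
    for meaning, signal in code.items():
        p = p_meaning[meaning]
        if p > 0:
            h -= p * log2(p / p_signal[signal])
    return h


# Hypothetical toy data: four equiprobable meanings.
uniform = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}
reversible = {"m1": "a", "m2": "b", "m3": "c", "m4": "d"}  # one-to-one code
ambiguous = {"m1": "a", "m2": "a", "m3": "b", "m4": "c"}   # "a" has two meanings
```

Under these assumptions, `ambiguity_bits(reversible, uniform)` is 0 bits, since the code can be reversed without uncertainty, whereas `ambiguity_bits(ambiguous, uniform)` is 0.5 bits, since the hearer of signal "a" cannot tell which of two meanings was encoded.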

The remainder of this article is organized as follows. In Section 2 we illustrate
several well-known types of ambiguity and we articulate the idea that
ambiguity appears because the phonetic form (which feeds the Articulatory-Perceptual
system) dispenses with part of the information present in the
logical form (which feeds the Conceptual-Intentional system). After locat- 
ing the emergence of ambiguity in the architecture of language, we proceed 
to develop our theoretical argument for the emergence of ambiguity from 
communicative efficiency factors. In Section 3 we introduce Landauer's
concept of logical (ir)reversibility of a given computational device and we 
quantify the degree of ambiguity of a code as the amount of uncertainty 
in reversing the coding process. In Section 4 we introduce Zipf's vocabulary
balance condition, a particular instance of Zipf's Least Effort Principle
(Zipf 1949), and we show how it can be properly generalized and accommodated
to the information-theoretic framework adopted in Section 3. We
conclude our reasoning by showing that if the coding and the decoding 
processes are performed in a cooperative regime expressed in terms of a 
symmetry equation between coding and decoding complexities, a certain 
amount of logical uncertainty or ambiguity is unavoidable. In Section 5
we recapitulate our derivation of the presence of ambiguity and we stress a 
further important result intimately related to our development: Zipf's law,
another well-known and ubiquitous feature of natural language products,
is the sole expected outcome of an evolving communicative system that
satisfies the symmetry equation between coding and decoding complexities,
as argued in Corominas-Murtra et al. (2011). Appendix A emphasizes
the relationship between logical irreversibility and thermodynamical irreversibility.



2 The locus of ambiguity in the architecture
of language



In this preliminary section we would like to reflect on the locus of ambiguity
in language. We shall adapt Chomsky's general assumptions about
the architecture of language (Chomsky 1995). Let a particular language $\mathcal{L}$
be constituted of a lexicon $Lex$ and a generative procedure $\Gamma$ that generates
a set $\Delta$ of structural descriptions of the form $\langle \pi, \lambda \rangle$, where $\pi$ is an element
of the set $\Pi$ of phonetic forms and $\lambda$ an element of the set $\Lambda$ of logical
forms. The set $\Delta$ of structural descriptions of $\mathcal{L}$ generated by $\Gamma$ is the set
of expressions of $\mathcal{L}$. We thus view a language $\mathcal{L}$ as a tuple

$$\mathcal{L} = \langle Lex, \Gamma, \Delta \rangle.$$
The Articulatory-Perceptual (A-P) system and the Conceptual-Intentional
(C-I) system are the external systems of language. The elements of $\Pi$ are
interpreted at the A-P system and provide it with the necessary instructions
to generate acoustic signs in an oral language and patterns of visual signs
in a sign language, and the elements of $\Lambda$ are interpreted at the C-I system,
where an expression is semantically interpreted.

$\Gamma$ is assumed to have an initial branch $\Gamma_{init}$ responsible for generating a
set $S$ of hierarchical representations whose terminals are lexical items that
provide constituency and chain relationships; once $S$ has been generated,
$\Gamma$ splits into two branches: (i) the externalization branch $\Gamma_{ext}$, which generates
$\Pi$ from $S$, and (ii) the abstract branch $\Gamma_{abs}$, which generates $\Lambda$ from
$S$. We shall refer to this splitting model as the inverted-Y model.

Figure 1: Inverted-Y model



We shall immediately argue that ambiguity appears because $\Pi$ dispenses
with part of the information that is present in $\Lambda$. In other words, the branch
$\Gamma_{ext}$ eliminates or does not contain some information that is preserved or
contained at the branch $\Gamma_{abs}$ and necessary for the C-I system. Once we
have identified the loss of resolution in $\Gamma_{ext}$ as the locus of ambiguity in
language, we shall investigate in the following sections whether the low
resolution of $\Pi$ is a communicative imperfection or an unavoidable byproduct
of certain efficiency considerations in natural communication systems.
2.1 Lexical ambiguity

Let us first briefly consider lexical ambiguity. As is well known, we can distinguish
two types of lexical ambiguity: homonymy, which can be partial or
absolute (Lyons 1995), and polysemy. Following Lyons, absolute homonyms
can be defined in the following terms.

Two lexemes are absolute homonyms iff:

1. they are unrelated in meaning 

2. all their forms are identical 

3. the identical forms will be grammatically equivalent 

Lyons (1995: 55) 



Two lexemes are partial homonyms if there is identity of at least one form
and not all three conditions are satisfied; accordingly, the two lexical entries
$bank_1$ ("financial institution") and $bank_2$ ("sloping side of a river")
constitute classical examples of absolute homonymy (Lyons 1995, p. 55),
but the verbs find and found illustrate partial homonymy, since they share
the form found, but not the forms finds, finding, or founds, founding (they
do not satisfy condition (2)).

Whereas homonymy is a relation between lexemes, polysemy is a property 
of single lexemes. A lexeme is said to be polysemic if it possesses several 
meanings. The criteria to distinguish polysemy and homonymy are etymology
and relatedness of meaning, in such a way that etymology generally
supports the intuitions of speakers about relatedness of meaning, but not
always. Indeed, as observed by Lyons (1995, p. 59):



"It sometimes happens that lexemes which the average speaker 
of the language thinks of as being semantically unrelated have 
come from the same source", as in the case of $sole_1$ ("bottom
of foot or shoe") and $sole_2$ ("kind of fish").



It is not surprising that etymology tends to support the intuitions of native 
speakers in the general case, provided that metaphorical extension is both 
an important factor in semantic change and a synchronic process at work in
defining semantic relatedness. Although the distinction between polysemy
(also called complementary ambiguity) and homonymy (or contrastive ambiguity)
may be fuzzy and vary among speakers, it is usually assumed to
reflect central properties of the lexicon. We can view, for concreteness, a
lexical entry as a set whose single element is a pair constituted of a signifier
and a signified; in the case that two lexemes $lexeme_1$, $lexeme_2$ are absolute
homonyms, their respective signifiers $signifier_1$, $signifier_2$ turn out
to be phonologically identical, although they are etymologically unrelated
and their signifieds are unconnected.

$$lexeme_1 = \{\langle signifier_1, signified_1 \rangle\}$$

$$lexeme_2 = \{\langle signifier_2, signified_2 \rangle\}$$

Following what we can call a static conception of the lexicon, a polysemic
$lexeme_n$ could be listed as a single entry with multiple signifieds associated
to a single signifier, that is, as a set of several pairs all constituted of the
$signifier_n$ and a different signified.

(1) $lexeme_n = \{\langle signifier_n, signified_{n_1} \rangle, \langle signifier_n, signified_{n_2} \rangle, \ldots\}$



However, as argued by Pustejovsky (1995, chapter 4), this
static conception of the lexicon (or Sense Enumerative Lexicon, in Pustejovsky's
terminology) is not adequate to provide an explanatory treatment
of polysemy in natural languages. Polysemy, as part of the creative use
of words ("how words can take an infinite number of meanings in novel
contexts" (Pustejovsky 1995, p. 42)), contributes to the expressive power
of natural languages, but the static conception is unable to address the
question of what the logical relationship among the multiple meanings of
a lexeme is. According to Pustejovsky's generative model of the lexicon,
there is one basic meaning for an apparently polysemic verb such as bake, and
"any other readings are derived through generative mechanisms in composition
with its arguments". In this case, the change of state sense of bake is
enumerated in the lexicon, but not the creation sense, which is generated
when the verb is combined with a DP that denotes an artifact.



(2) a. John baked a potato [change of state] 
b. John baked a cake [creation sense] 
Therefore, apparently polysemic lexemes would not be listed as in (1) but
as in (3).

(3) $lexeme_n = \{\langle signifier_n, signified_n \rangle\}$

Accordingly, all lexical entries would be a singleton of a pair of a signifier
and a signified. When two lexemes are homonyms we postulate two different
signifiers that are related to two different signifieds, and when a single
lexeme is apparently polysemic the lexicon contains a regular entry composed
of a single signifier and a single signified, other signifieds being
derived from the basic signified and generative mechanisms. Therefore, following
Pustejovsky's approach, there would be no truly ambiguous lexical
entry, i.e., no lexical entry where a single signifier is associated to multiple
signifieds. Semantic representations generated for the C-I system putatively
keep track of the lexical meaning, and build the meaning of complex
expressions on the basis of the meaning of lexical items and the syntactic
operations that generate them; consequently there is no lexical ambiguity
generated during $\Gamma_{init}$ and $\Gamma_{abs}$. So-called lexical ambiguity does not appear
at $Lex$, $\Gamma_{init}$ or $\Gamma_{abs}$, but only at $\Gamma_{ext}$, when a signifier is transferred
and a hearer can potentially assign it more than one signified and thus needs
to resort to the communicative or to the linguistic context to hypothesize
the signified that the speaker intended to convey.
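
The argument of this subsection can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon in which every entry is a single signifier-signified pair, as in (3); the names `LEXICON`, `externalize` and `candidate_signifieds`, and the entry for cake, are illustrative inventions and not part of the formal model.

```python
# Hypothetical toy lexicon: each lexeme is one (signifier, signified) pair.
LEXICON = {
    "bank_1": ("bank", "financial institution"),
    "bank_2": ("bank", "sloping side of a river"),
    "sole_1": ("sole", "bottom of foot or shoe"),
    "sole_2": ("sole", "kind of fish"),
    "cake": ("cake", "baked sweet food"),
}


def externalize(lexeme):
    """Keep only the signifier: the lexeme's identity is not pronounced."""
    return LEXICON[lexeme][0]


def candidate_signifieds(signifier):
    """All signifieds that a hearer could assign to a received signifier."""
    return sorted(
        signified
        for stored_signifier, signified in LEXICON.values()
        if stored_signifier == signifier
    )
```

No entry of the toy lexicon is itself ambiguous, yet `externalize("bank_1")` and `externalize("bank_2")` return the same signifier, so `candidate_signifieds("bank")` returns two signifieds: the ambiguity arises only when the signifier is transferred.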



2.2 Syntactic ambiguity 

Ambiguity is not only lexical (when a hearer can potentially assign more
than one signified to a signifier), but also syntactic. Syntactic ambiguity
can be relative to syntactic constituency or to chain-formation, as illustrated
in (4) and (5) respectively. In (4-a) the PP with Mary modifies
the VP talked about your story, whereas in (4-b) it modifies the DP your
story. In (5) the lower copy of the lexical item when marks its base position;
in (5-a) when is base-generated in the VP headed by said and in
(5-b) in the VP headed by leave. Intermediate copies of when as well as
internal merge operations of syntactic categories other than when are not
represented in these very partial analyses.



(4) I talked about your story with Mary

a. [TP I [VP [VP talked [PP about your story]] [PP with Mary]]]

b. [TP I [VP talked [PP about [DP your story [PP with Mary]]]]]

(5) I wonder when Mary said John would leave

a. [TP I wonder [CP when Mary [VP said when [CP John would leave]]]]

b. [TP I wonder [CP when Mary said [CP John would [VP leave when]]]]

Following standard conceptions, the initial branch $\Gamma_{init}$ is responsible for
the external merge of lexical items and for overt applications of internal
merge. Therefore, immediately before the generative procedure $\Gamma$ splits into
$\Gamma_{ext}$ and $\Gamma_{abs}$, there is no ambiguity as to constituency and chain-formation.
In (4) and (5) concretely, either objects (a) or objects (b) have been generated
by $\Gamma_{init}$. There is, though, an apparent asymmetry between $\Gamma_{abs}$
and $\Gamma_{ext}$: whereas in the former the information relative to constituency
and chain-formation is preserved, it is lost in the latter, since constituent
structures are transformed into strings of morphemes, which do not display
any discrete morphophonological correlate to syntactic nodes (or, say,
parentheses in first order logic) and, typically, all copies but the head of
the chain are deleted in $\Gamma_{ext}$ (they are not pronounced). This triggers the
appearance of syntactic ambiguities, such as those illustrated in (4) and (5), or
in (6), which contains an object that can be both a disjunction of a term and a
conjunction ((6-a)), and a conjunction of a disjunction and a term ((6-b)).



(6) I'll call Peter or Mary and John

a. I'll call [Peter or [Mary and John]]

b. I'll call [[Peter or Mary] and John]
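
The loss of constituency information can be sketched in a few lines. The sketch assumes, purely for illustration, that constituent structures are encoded as nested tuples and that externalization reduces to reading off the terminal words from left to right; `flatten`, `reading_a` and `reading_b` are hypothetical names.

```python
def flatten(tree):
    """Map a constituent structure (nested tuples) to its string of words."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree:
        words.extend(flatten(child))
    return words


# The two bracketings of the object in (6), encoded as nested tuples.
reading_a = ("Peter", "or", ("Mary", "and", "John"))  # (6-a)
reading_b = (("Peter", "or", "Mary"), "and", "John")  # (6-b)
```

The two structures are distinct objects, but `flatten(reading_a)` and `flatten(reading_b)` return the same word sequence, so the mapping from structures to strings cannot be reversed without additional information.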

The apparent asymmetry just noted requires further consideration. Although,
typically, there is no morphophonological discrete symbol representing
non-final copies of a chain and no discrete symbol representing constituency,
experimental research has shown that the presence of syntactic
boundaries may be signaled through non-discrete prosodic features such as
duration, amplitude and frequency peaks (cf. Prieto (1997) and references
therein). Let us consider, for concreteness, some evidence in Catalan. It is
relatively easy to construct in this language sentences that are both lexically
and syntactically ambiguous. In (7), for instance, jove ('young') can
be either a noun or an adjective; if it is a noun, then veu is a verb
('sees'), l' an article ('the') and amenaça a noun ('threat'); if jove is an
adjective, then veu is a noun ('voice'), l' an object clitic and amenaça a
verb ('threaten'). The written sentence under discussion can thus have the
two different syntactic analyses indicated.



(7) La jove veu l'amenaça (Ferrater 1981)

a. jove: noun; veu: verb; l': article; amenaça: noun

b. [VP [DP la [N jove]] [VP [V veu] [DP l'amenaça]]]
'The young lady sees the threat'

c. jove: adjective; veu: noun; l': clitic; amenaça: verb

d. [VP [DP la [AdjP jove [N veu]]] [V l'[V amenaça]]]
'The young voice threatens him/her'



However, as formerly noted by Ferrater (1981, p. 80), when the
sentence illustrated in (7) is pronounced the ambiguity disappears. As
observed by Bonet (1984), the two different interpretations have different
intonational patterns. More precisely, as shown by Prieto (1997), an intermediate
phrase boundary is phonetically marked by a high phrase accent
H- placed at the end of a phonological constituent; accordingly, the boundary
tone H- is placed after jove when jove is a noun, and thus the subject
of the sentence is the DP la jove ("the young lady"), and after veu when
jove is an adjective and the subject is la jove veu ("the young voice").

Prieto's perceptual results reveal an interesting fact: when the phonological
phrasing is [la jove] [veu l'amenaça], with a phrase boundary after jove,
hearers assign it a sole interpretation, namely that where the subject is
la jove, as expected, in contrast with the phonological phrasing [la jove
veu] [l'amenaça], which is ambiguous: the absence of a phrase boundary
between jove and veu is compatible with the two possible syntactic analyses,
whereas the presence of such a boundary is compatible only with one.
This suggests that "linguistic context is the key disambiguation factor in
natural speech, and that phrasing decisions, subject to certain constraints,
are highly optional" (Prieto 1997, p. 187-188).
Let us consider a further complication brought up by (8), an example due
to J. Mascaró (p.c.). In this case, el can be either an article or a clitic; if it
is an article, then pot is a noun ("can"), dur an adjective ("hard") and
cap a verb ("fits") ((8-a)); if el is a clitic, then pot is an existential modal
verb ("can"), dur an infinitive verb ("bring"), and cap a is a complex preposition
("towards") ((8-b)). But in (8) the two possible syntactic analyses
are mapped into the same intonational phrases ((8-c)). Further work should
investigate if any other prosodic factors are at work in producing differentiated
phonetic forms and to what extent these features are perceived.

(8) El pot dur cap a l'armari

a. [TP [DP el [NP pot [AdjP [Adj dur]]]] [TP cap [PP a l'armari]]]
'The hard can fits in the cupboard'

b. [ModalP [Modal el pot] [VP [V° dur] [PP cap a l'armari]]]
'He can bring it to the cupboard'

c. [IntP2 el pot dur] [IntP1 cap a l'armari]

Therefore, prosody seems to play an auxiliary role in indicating how words 
are combined into phrases. The amount of constituency ambiguities is dras- 
tically reduced if not only discrete features but also non-discrete prosodic 
features are taken into consideration. As a consequence, produced phonetic 
forms may display a higher resolution in this respect than a superficial 
glance would suggest. Although prosodic patterns are available to avoid 
the appearance of constituency ambiguities in phonetic forms, they may 
be used only as optional, and not mandatory, strategies and perhaps in a 
gradual way, depending on pragmatic factors. For instance, speakers may 
take advantage of the continuous nature of prosodic features to minimize 
their efforts in externalizing an expression by resorting to relatively marked 
prosodic patterns only to avoid ambiguities that could not be easily solved 
by taking into consideration the communicative context. In the end, a pho- 
netic form with a relatively high resolution may not be an optimal strategy 
to avoid the appearance of an ambiguity that could be easily solved on
pragmatic grounds. It is reasonable to conjecture that a speaker will mea- 
sure his/her effort in externalizing an expression by considering the efforts 
that the hearer will have to make in order to interpret that expression. It is 
also important to recall the presence of noise: as noted, produced phonetic 
forms with an unambiguous phonological phrasing can indeed be perceived 
as ambiguous by hearers. 
Finally, we must keep in mind that the match between intonational phrases
and syntactic phrases is far from trivial, as illustrated in the following well-known
example (Chomsky and Halle 1968, Selkirk 1986), where the
intonational phrases do not correspond to the syntactic phrases.



(9) This is the cat that caught the rat that stole the cheese 

a. [IP This [VP is [DP the cat [CP that [VP caught [DP the rat
[CP that [VP stole [DP the cheese]]]]]]]]]

b. [IntP3 This is the cat] [IntP2 that caught the rat] [IntP1 that
stole the cheese]



Needless to say, more experimental work is still necessary to have a more
general and solid understanding of how prosody is used in production and
perception to disambiguate otherwise ambiguous sentences, such as those
in (4) and (6), as well as those in (10), for instance, where a postnominal
adjective can be both a noun complement and a secondary predication,
thereby differing from languages like English where the postnominal adjective
would introduce a secondary predication.



(10) Ha trobat el llibre interessant

a. He has found the interesting book 

b. He has found the book interesting 



2.3 Quantifier scope ambiguity 



The existence of quantifier scope ambiguities in natural languages has been
an interesting conundrum in the development of a formal theory of the
syntax-semantics interface (Montague 1974, Cooper 1983, May 1985,
among many others). Consider the following two examples.



(11) Every linguist knows two languages 

(12) Peter believes that one of his friends has betrayed him 



The sentence in (11) illustrates that in English, as in many other languages,
given an object numeral phrase headed by two and a subject universal quantificational
phrase headed by every, it is possible that the subject universal
quantificational phrase scopes over the object numeral phrase (a scope relation
that reflects the apparent constituent ordering), but also that the
object numeral phrase scopes over the subject universal quantificational
phrase (a scope relation that disagrees with the apparent constituent ordering).
The sentence in (12) illustrates a similar phenomenon: the DP one
of his friends can have wide or narrow scope with respect to the propositional
attitude verb believe, a lexically intensional element, thereby yielding,
respectively, a de re or a de dicto interpretation.
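
The two readings of (11) can be made concrete by evaluating them against a small model. The sketch below assumes that two is read as "at least two" and uses an invented set of linguists and languages; `KNOWS`, `surface_scope` and `inverse_scope` are hypothetical names introduced only for this illustration.

```python
# Hypothetical toy model: the languages known by each linguist.
KNOWS = {
    "Ana": {"Catalan", "English"},
    "Bo": {"English", "Basque"},
    "Cy": {"Basque", "Catalan"},
}
LANGUAGES = set().union(*KNOWS.values())


def surface_scope(knows):
    """every > two: each linguist knows at least two (possibly different) languages."""
    return all(len(langs) >= 2 for langs in knows.values())


def inverse_scope(knows, languages):
    """two > every: at least two languages are each known by every linguist."""
    shared = [
        lang
        for lang in languages
        if all(lang in langs for langs in knows.values())
    ]
    return len(shared) >= 2
```

In this model `surface_scope(KNOWS)` is true, whereas `inverse_scope(KNOWS, LANGUAGES)` is false, because no language is shared by all three linguists. The same string therefore corresponds to two logical forms with different truth conditions.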

Scope ambiguities in languages like English contain no lexical ambiguity 
and apparently there are no independent arguments to postulate differ- 
ent syntactic representations and derivations for different scope relations. 
This poses an obvious difficulty for maintaining the Compositionality Principle,
understood as follows: 



Compositionality Principle. The meaning of a complex expression
is a function of the meaning of its parts and of the syntactic
rules by which they are combined. (Partee et al. 1990,
p. 318)



As observed by Pelletier (1993):



"In order to maintain the Compositionality Principle, theorists
have resorted to a number of devices which are all more or
less unmotivated (except to maintain the Principle): Montagovian
'quantifying-in rules', 'traces', 'gaps', 'Quantifier Raising'...
features, and many more".



Although Pelletier's criticism of Montagovian and Government and Binding
approaches may be fair, it seems to us that it does not apply to current
minimalist transformational proposals. As observed by Chomsky (2004), if
we allow $\Gamma$ to generate hierarchical objects by externally merging elements
from $Lex$ into the syntactic workspace but also by internally merging elements
that have already been introduced into the syntactic workspace, we
can constructively capture the property of displacement, which is apparently
ubiquitous in natural languages. Traces do not need to be generated,
but simply multiple copies. Moreover, the postulation of internal merge
operations is not only a matter of applicability but also a default option: if
external merge is available, only by stipulation could we ban the syntactic
procedure from performing internal merge operations, a stipulation that would
be unmotivated given our current understanding of syntactic patterns.

If we allow internal merge operations to take place not solely at $\Gamma_{init}$ but
also during $\Gamma_{abs}$, we obtain the minimalist correlate of quantifier raising.
As a consequence, different syntactic derivations now yield different scope
relations and the syntax-semantics interface is kept as simple as possible,
governed by the Compositionality Principle but without stipulating any
unmotivated elements. Internal merge operations responsible for defining
quantifier scope are covert in a language like English, which means that they
take place during $\Gamma_{abs}$ and that they have no correlate at $\Gamma_{ext}$. Therefore,
$\Pi$ also displays a lower resolution than $\Lambda$ as for quantifier scope.



2.4 Garden path sentences 



Garden path sentences appear when a substring of words of a given string 
is in principle ambiguous, but one of its interpretations is more likely than 
the others. In parsing the whole string, the most likely interpretation of the 
substring must be abandoned for interpretive reasons, and an alternative 
interpretation must be found. English sentences (13) and (14) are well-known
illustrations of this phenomenon.



(13) The government plans to raise taxes were defeated 

(14) The horse raced past the barn fell 

It is relatively easy to construct garden path sentences in English, arguably
because of the type of ambiguity allowed in this language, such as those
derived from the omission of the relativizer, as in (14). In Romance languages,
which lack this type of ambiguity, it is not so easy. In (15) we
provide an instance of a garden path sentence in Catalan due to Rosselló
(2008) that does not derive from relativizer omission.

(15) Hem de redirigir les tendències cap a la dreta cap a l'esquerra

'We must redirect the tendencies towards the right towards the left'
(literal translation)

a. First interpretation or syntactic analysis:

Hem de [VP redirigir [DP les tendències]] [PP cap a la dreta]
*[PP cap a l'esquerra]

b. Second interpretation or syntactic analysis:

Hem de [VP redirigir [DP les tendències [PP cap a la dreta]]
[PP cap a l'esquerra]]

In the first bracket analysis the PP cap a la dreta ("towards the right")
is analyzed as an adjunct of the VP; when the next PP cap a l'esquerra
("towards the left") is parsed, the syntactic analysis crashes, since it is not
possible to assign this PP any adjunct or argument relationship. In the
second bracket analysis the first PP cap a la dreta is a modifier of the DP
les tendències, and thus the second PP cap a l'esquerra can now be properly
interpreted as an adjunct of the VP.

The source of these garden path sentences is syntactic. It is also possible
to construct garden path sentences in Catalan whose source is phonological
(Joan Mascaró, p.c.). Consider the following sentence:

(16) Els reis Joan Carles i Sofia 

'The kings Joan Carles and Sofia'. 

Due to the contextual deletion of the two occurrences of the plurality morpheme
-s attached to the article and the noun, (16) is pronounced as

(17) El rei Joan Carles i Sofia.

As a consequence, the string (17) is morphophonologically ambiguous: the
hearer can interpret it as the coordination of a singular DP el rei Joan
Carles and the noun Sofia, which does not correspond to the intended
meaning of the expression before the deletion of the two final occurrences
of the plurality morpheme -s. In order to arrive at the intended meaning,
the hearer needs to reconstruct the two deleted -s, thereby obtaining a
plural DP that contains two coordinated nouns (i.e., Els reis Joan Carles i
Sofia).

The existence of garden path sentences could be viewed as a sign of the 
poor design of language for communication, given that they pose parsing 
difficulties. However, garden path sentences derive from the possibility of 
having ambiguous strings of words and from the non-uniform probability 
among the interpretations of a string. That probabilities among 
interpretations may be non-uniform comes for free; if the property of strings 
of being ambiguous can be derived from efficient communication 
considerations, as we shall argue, then we can safely conclude that garden 
path sentences are not a sign of the poor design of language for communication. 



2.5 Concluding remarks on the locus of ambiguity 

Before concluding this section we want to raise a further issue. It is impor- 
tant to distinguish ambiguous expressions from vague expressions. Whereas 
an ambiguous expression is an expression with more than one meaning, a 
vague expression is an expression whose meaning is imprecise or vague. 
Some clear examples of vague predicates in natural languages are the En- 
glish adjectives tall, young, cold, rich and smart. We consider them vague 
for there are some objects that clearly display these properties, others that 
do not display them whatsoever, and others that display them or that do 
not display them 'to a certain degree'. In general, a predicate P is vague iff 
the cut-off between satisfying P and not satisfying P is blurry, which means 
that there will be some borderline object x for which it is impossible to 
determine whether or not x satisfies P. 

Many predicates in natural languages are vague, and do not adhere to 
the tertium non datur law, according to which, for a given proposition p, 
either the formula 'p' or the formula 'not p' is true. Whereas mathematical 
reasoning characteristically adheres to the tertium non datur law, most 
non-mathematical reasonings are full of items that denote vague predicates 
for which it is not determined whether or not some objects satisfy them. 

Because of its ubiquity in natural languages and human reasoning, it is 
crucial to have a good mathematical characterization of vagueness, not only 
for linguistics (how should we represent the lexical meaning of vague terms 
and their contribution to the meaning of complex expressions?), but also 
for logic (how should we characterize correct reasoning involving vague 
terms?) and artificial intelligence (how can we efficiently implement vague 
reasoning in artificial intelligence systems?). However, we emphasize that 
vagueness does not fall under the scope of the communicative argument we 
are about to develop, which is concerned strictly with ambiguity. The 
existence of vague terms is not related at all to the factors of efficient 
communication we will present. We also want to note that deictic words such 
as the noun today, the adverb here or the pronoun I are not instances of 
ambiguity, and thus are unrelated to the course we shall develop. The meaning 
of these words is unambiguously connected to the speech act in such a way 
that they always mean, respectively, "the day of the speech act", "the place 
of the speech act" and "the logophoric agent of the speech act". 

So far we have reviewed certain types of ambiguity that are familiar to 
grammarians: lexical ambiguity, syntactic ambiguity, and scope ambiguity. 
We have considered further elementary distinctions: lexical ambiguity can 
be divided into (total and partial) homonymy and polysemy, and syntactic 
ambiguity into constituency ambiguity and chain-formation ambiguity; we 
have also analyzed several Catalan constructions that display at the same 
time lexical ambiguities and constituency ambiguities. The general claim of 
this section is that ambiguity appears at the externalization branch ($T_{ext}$) 
of the generative procedure. More precisely: 

1. Lexical ambiguity. Ambiguity relative to lexemes appears at the ex- 
ternalization branch, when a hearer needs to resort to the communica- 
tive or the linguistic context to choose one meaning among multiple 
meanings that can potentially be associated to a morphophonological 
form. 

2. Syntactic ambiguity and prosody. Prosodic patterns play an ancil- 
lary role in avoiding the emergence of constituency ambiguities in 
the production of phonetic forms by indicating the limits between 
constituents; although some phonetic forms are unambiguous as for 
phrasing in production, they are ambiguous in perception. 
Chain-formation ambiguities are triggered by the deletion of base-generated 
copies at $T_{ext}$. 

3. Quantifier scope ambiguity. Internal merge, which is constructively 
used to analyze displacement patterns in general, can be naturally ap- 
plied to represent inverse scope relations, without needing to stipulate 
any unmotivated elements. Scope ambiguities appear because internal 
merge operations responsible for defining inverse scope relations 
take place at $T_{abs}$, and thus have no effect on $T_{ext}$. 

After locating the emergence of ambiguity in the architecture of language, 
we are now ready to investigate what factors relative to the externalization 
of linguistic expressions require a certain loss of the information relevant 
for the C-I system, in order to understand why the generative procedure 
opts for reducing the resolution of $\Pi$ (phonetic forms) with respect to 
$\Lambda$ (logical forms) in connecting the C-I system and the A-P system. 
This leads us to move from grammar-internal considerations to the study of 
general conditions on how two agents communicate efficiently. Accordingly, 
we shall present in the following section a rigorous and general framework 
based on fundamental concepts of computation theory and information theory. 



3 Ambiguity and logical reversibility: a rigorous treatment 



In this section we begin by presenting the concept of logical (ir)reversibility 
of a given computation. Subsequently we formally explain how a code gen- 
erated through logically irreversible computations is necessarily ambiguous 
and we quantify the degree of ambiguity as the minimum amount of infor- 
mation needed to properly reconstruct a given message. 



3.1 Logical irreversibility 



The concept of computation is theoretically studied as an abstract process 
of data manipulation and only its logical properties are taken into 
consideration; however, if we want to investigate how an abstract computation 
is realized by a physical system, such as electronic machinery, an abacus 
or a biological system such as the brain, it becomes important to consider the 
connections between the logical properties of (abstract) computations and 
the physical (or, more precisely, thermodynamical) properties of the system 
that performs those computations. 

The fundamental examination of the physical constraints that computations 
must satisfy when they are performed by a physical system was started by 
Landauer (1961), and continued in several other works (Bennett 1973; 
Bennett and Landauer 1985; Ladyman et al. 2007; Bennett 2008; Toffoli 1980). 
The general objective of these approaches is to determine the physical 
limits of the process of computing, the "general laws that must govern 
all information processing no matter how it is accomplished" (Bennett and 
Landauer 1985, p. 48). Therefore, the concept of computation is subject 
to the same questions that apply to other physical processes, and thus the 
following questions become central to the physical study of computational 
devices (Bennett and Landauer 1985, p. 48): 
Figure 2: A computation is said to be logically irreversible if the input 
cannot be univocally determined from knowledge of the output alone. Here 
we have two examples of simple logic gates performing irreversible 
computations. The AND gate (which corresponds to the logical connective "∧") 
has the truth table shown on the bottom. Although the output "1" can 
only be obtained from the input "11", the input for the output "0" can be 
either "10", "01" or "00". The presence of a "0" 
in the output is not enough to properly revert the computational process, 
and therefore we need an amount of extra information if we want to know 
the inputs with no errors. Therefore, the computations of this gate are 
logically irreversible. The same occurs with the OR gate, corresponding to 
the logical connective "∨". This example has been taken from Bennett and 
Landauer (1985). 



1. How much energy must be expended to perform a particular compu- 
tation? 

2. How long must it take? 

3. How large must the computing device be? 

A central objective is to study the reversibility/irreversibility of a 
computational process. Let us thus introduce the concept of logical 
reversibility/irreversibility, which is crucial to our concerns. 



As remarked by Bennett and Landauer (1985), no computation ever generates 
information since the output is implicit in the input. For instance, if we 
consider the operation of adding (+) defined on the set of natural numbers 
N, then the expression +(2, 3) contains its output, 5; accordingly, we say 
that the output 5 is implicit in the input (2, 3) of the operation + defined 
on N, in which case all the information contained in the output is contained 
in the input. However, many operations destroy information whenever two 
previously distinct situations become indistinguishable, in which case the 
input contains more information than the output. If we consider again the 
operation + defined on N, the output 5 can be obtained from the following 
inputs: (0, 5), (1, 4), (2, 3), (3, 2), (4, 1), (5, 0). The concept of logical 
irreversibility is introduced in Landauer (1961, p. 264) in order 
to study those computations whose input cannot be unequivocally 
determined from their output: 

"We shall call a device logically irreversible if the output of a 
device does not uniquely define the inputs". 



Conversely, a device is logically reversible if its inputs can be unequivocally 
determined from its output (see Figure 2). 
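The definition can be checked mechanically on small examples. The following is a minimal sketch, not part of the original argument: the helper names `preimages` and `is_logically_reversible` are our own, and the domains are restricted to small finite sets for illustration. It groups the inputs of an operation by the output they produce; an operation is logically irreversible as soon as some output has more than one input.

```python
from collections import defaultdict
from itertools import product


def preimages(f, inputs):
    """Group the given inputs by the output they produce under f."""
    groups = defaultdict(list)
    for x in inputs:
        groups[f(*x)].append(x)
    return dict(groups)


def is_logically_reversible(f, inputs):
    """True when every output is produced by exactly one input."""
    return all(len(xs) == 1 for xs in preimages(f, inputs).values())


# All two-bit inputs, in the order 00, 01, 10, 11.
bits = list(product((0, 1), repeat=2))

# AND gate: the output 0 has three possible inputs, the output 1 only one.
and_gate = preimages(lambda a, b: a & b, bits)

# Addition on a finite fragment of N: six inputs collapse onto the output 5.
pairs = [(a, b) for a in range(6) for b in range(6)]
sum_to_5 = preimages(lambda a, b: a + b, pairs)[5]

# A gate that keeps its first input and XORs it into the second loses nothing.
keeps_inputs = is_logically_reversible(lambda a, b: (a, a ^ b), bits)
```

Under these assumptions, `and_gate[0]` lists the three inputs that the output "0" cannot distinguish, and `sum_to_5` reproduces the six pairs given in the text.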



3.2 Logical (ir)reversibility in terms of Turing machines 



In this section we rigorously define logically (ir)reversible computations. 
To this end, we must present an abstract computing device, i.e., a Turing 
machine. A Turing machine is an ideal calculus algorithm defined by Turing 
(1937). Turing machines seem to constitute a stable and maximal class 
of computational devices in terms of the computations they can perform. 
Although Turing machines are rather primitive and simple, they are capable 
of expressing any algorithm and of simulating any programming language, 
and indeed, any operation that a modern computer can perform can be 
simulated by a Turing machine. In fact, it is widely accepted that any way 
of formalizing the intuitive idea of what is computable must be equivalent to 
a Turing machine, a conjecture commonly known as the Church-Turing thesis 
(Davis 1985). 



We informally describe the action of a Turing machine on an infinite tape 
divided into squares as a sequence of simple operations that take place after 
an initial moment. At each step, the machine is at a particular internal 
state and its head examines a square of the tape (i.e., reads the symbol 
written on a square). The machine subsequently writes a symbol on that 
square, changes its internal state and moves the head one square leftwards 
or rightwards or remains at the same square. 

Formally, a Turing machine $T$ is composed of a finite set of internal states 
$Q$, a finite set of symbols $\Sigma$ (an alphabet) and a transition function 
$\delta$: 

$$T = (Q, \Sigma, \delta).$$

There is an initial state $s$ belonging to $Q$ and in general 
$Q \cap \Sigma = \emptyset$. Two special symbols, the blank $\sqcup$ and the 
initial symbol $\triangleright$, belong to $\Sigma$. The transition function 
$\delta$ takes as input an ordered pair and yields as output a triple. The 
first component of the ordered pair taken as input can be any member of $Q$ 
and the second component any element of $\Sigma$; therefore, the domain of 
$\delta$ is the Cartesian product $Q \times \Sigma$. The first component of 
the triple yielded as output is any member of $Q$, the second component any 
member of $\Sigma$ and the third component any member of $\{L, R, D\}$, where 
'$L$' and '$R$' mean, respectively, "move the head one square leftwards" and 
"move the head one square rightwards", and '$D$' means "stay at the square 
just examined". Accordingly, the range of $\delta$ is the Cartesian product 
$Q \times \Sigma \times \{L, R, D\}$. We succinctly define $\delta$ as usual: 

$$\delta : Q \times \Sigma \to Q \times \Sigma \times \{L, R, D\}.$$

Thus, $\delta$ is the program of the machine; it specifies, for each 
combination of current state $\sigma_i \in Q$ and current symbol 
$r_k \in \Sigma$, a triple 

$$(\sigma_j, r_\ell, D),$$

where $\sigma_j$ is the state immediately after $\sigma_i$, $r_\ell$ is the 
symbol to be overwritten on $r_k$, and $D \in \{L, R, D\}$. A schema of how a 
computation is performed in a Turing machine is shown in Figure 3. 

In general, we say that $T$ performs logically reversible computations if the 
inverse function of $\delta$, $\delta^{-1}$, defined (on the image of 
$\delta$) as 

$$\delta^{-1} : Q \times \Sigma \times \{L, R, D\} \to Q \times \Sigma,$$

exists. This implies that distinct inputs always yield distinct outputs and, 
therefore, we can invert the process for every element of the input set. The 
non-existence of $\delta^{-1}$ is due to the fact that 

$$(\exists \alpha, \beta \in Q \times \Sigma) : ((\alpha \neq \beta) \wedge (\delta(\alpha) = \delta(\beta) \in Q \times \Sigma \times \{L, R, D\})).$$

Therefore, from the knowledge of $\gamma = \delta(\alpha) = \delta(\beta)$ we 
cannot determine with certainty the actual value of the input, either 
$\alpha$ or $\beta$ (see Figure 4 for a simple example). In these cases, we 
say that $T$ performs logically irreversible computations. 

Figure 3: The sequence of a given computation in a Turing machine (see 
text). 

After these general definitions, we shall provide a particular definition of a 
Turing machine suitable for the study of the coding process. This coding 
machine will be composed of a set of internal states $Q$, a transition 
function $\delta$ and two alphabets, an input alphabet 
$\Omega = \{m_1, ..., m_n\}$ and an output alphabet $S = \{s_1, ..., s_m\}$: 

$$T = (Q, \Omega, S, \delta).$$

We shall call the elements of $\Omega$ referents and the elements of $S$ 
signs. We assume that $Q \cap S = \emptyset$ and that 
$Q \cap \Omega = \emptyset$. For simplicity, we also assume that, in a coding 
process, the two alphabets are disjoint, 

$$\Omega \cap S = \emptyset,$$

i.e., an object is either a sign or a referent, but not both. Technically, this 
implies that $T$ can never reexamine a square where a sign has been printed, 
which means that $T$ must always move in the same direction; assume for 
concreteness that it must move rightwards. In order to clearly identify 
input and output configurations, we express the applications of $\delta$ in 
the following terms: 

$$\sigma_k m_i \to \sigma_j s_\ell R,$$

where $\sigma_k m_i$ and $\sigma_j s_\ell R$ ($\sigma_k, \sigma_j \in Q$, 
$m_i \in \Omega$, $s_\ell \in S$) are respectively input and output 
configurations. 
For the sake of simplicity, we will consider only deterministic Turing 
machines, i.e., those machines for which each input configuration yields 
exactly one output configuration, so that two different output configurations 
never arise from the same input configuration. That a coding machine is 
deterministic means that there are no properly synonymous signs that encode 
the same referent. A deterministic Turing machine will be logically reversible 
if each output configuration is obtained from only one input configuration, 
and irreversible if more than one input configuration yields a single output 
configuration. In other words, $T$ is ambiguous iff its $\delta$ is not 
injective. A logically irreversible coding machine generates, thus, ambiguous 
signs, for which there is not a unique referent. 
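The injectivity criterion can be illustrated with a toy coding machine. This is a minimal sketch under our own assumptions: the transition table `delta`, its state and symbol names, and the helper functions are hypothetical and only serve to show how a collision between two input configurations produces an ambiguous sign.

```python
from collections import defaultdict

# Hypothetical coding machine: each input configuration (state, referent)
# maps to exactly one output configuration (state, sign, move), so the
# machine is deterministic. The head always moves rightwards ("R").
delta = {
    ("q0", "m1"): ("q0", "s1", "R"),
    ("q0", "m2"): ("q0", "s2", "R"),
    ("q0", "m3"): ("q0", "s2", "R"),  # same output configuration as m2
}


def inverse_image(table):
    """Map every output configuration to the input configurations yielding it."""
    inv = defaultdict(list)
    for cfg_in, cfg_out in table.items():
        inv[cfg_out].append(cfg_in)
    return dict(inv)


def is_injective(table):
    """Logically reversible iff no output configuration has two inputs."""
    return all(len(ins) == 1 for ins in inverse_image(table).values())


def ambiguous_signs(table):
    """Signs printed by output configurations reachable from several inputs."""
    return sorted({out[1] for out, ins in inverse_image(table).items()
                   if len(ins) > 1})
```

For this table `is_injective(delta)` is false and the only ambiguous sign is `s2`, which the decoder can associate with either `m2` or `m3`.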



As shown by Bennett (1973), a logically irreversible Turing machine can 
always be made logically reversible at every step. Thus, logical 
irreversibility is not an essential property of computation. It is crucial for 
our concerns that a logically reversible machine need not be much more 
complicated than the irreversible machine it is associated with: computations 
on a reversible machine take about twice as many steps as on an irreversible 
machine and require a particular amount of temporary storage (Bennett 1973; 
Bennett and Landauer 1985). Therefore, the study of the complexity of the 
computations of the coding device alone does not seem to offer a necessity 
argument for the emergence of ambiguity but only a relatively weak 
plausibility argument, given that reversible computations are not 
significantly more complex than irreversible computations. Here we shall show 
that ambiguity must appear when a coding machine interacts with a decoding 
machine in an optimal way. 



3.3 Logical reversibility and ambiguous codes 

Before proceeding further in studying the concepts of logical (ir)reversibility 
in a communication system formed by two agents (a coder and a decoder) 
and the channel, some clarifications are in order. Firstly, we note that 
logical reversibility refers to the potential existence of a reconstruction or 
decoding algorithm, which does not entail that, in a real scenario, such an 
algorithm is at work; in other words, logical (ir)reversibility is a feature 
of the computations alone. Secondly, in the process of transmitting a 
signal through a channel, the presence of noise in the channel through 
which the output is received may be responsible for the emergence of logical 
irreversibility. This implies that, although the coder agent could in 
principle compute in a reversible regime, the noise of the channel makes 
the cascade system coder agent + channel analogous to a single computation 
device working in an irreversible regime. And finally, whereas logical 
(ir)reversibility is a property of the computational device (or coding 
algorithm) related to the potential existence of a reconstruction algorithm (or 
decoding algorithm), ambiguity is a property of signs: we say that 
a sign (an output of a coding computation) is ambiguous when the decoder 
can associate it with more than one referent (or input). A sign transmitted 
through a channel is ambiguous if the cascade coder agent + channel is 
logically irreversible, which may be due to the computations of the coding 
agent itself or due to the noise of the channel. 

Figure 4: Logical irreversibility: after a computation performed by a Turing 
machine, can we univocally determine the input from knowledge of the output 
alone? If the answer is no, the computations are said to be logically 
irreversible and the observer of the elements of the output set $S$ must face 
an amount of uncertainty to recover the elements of the input set $\Omega$. 



3.3.1 Noise: quantifying the degree of ambiguity 

The minimal amount of additional information needed to properly recon- 
struct the input from the knowledge of the output is identified as the quan- 
titative estimator of ambiguity. The more additional information we need, 
the more ambiguous the code is. This minimal amount of dissipated infor- 
mation is known as noise in standard information theory, and its formu- 
lation in terms of the problem we are dealing with is the objective of the 
following two subsections. 



To study logical irreversibility in information-theoretical terms, we choose 
a simple version of the transition function $\delta$: 

$$\delta : \Omega \to S.$$

This choice puts aside the role of the states but is justified for the sake of 
clarity and because the qualitative nature of the results does not change: 
the only changes are the sizes of the input and output sets. Let 
$\delta_{ij}$ be a matrix by which 

$$\delta_{ij} = \begin{cases} 1 & \text{if } \delta(m_i) = s_j, \\ 0 & \text{otherwise.} \end{cases}$$

Since the machine is deterministic, there is no possibility of having two 
outputs for a given input, therefore 

$$(\forall k \le n)(\exists!\, i \le m) : [(\delta_{ki} = 1) \wedge (\forall j \neq i)(\delta_{kj} = 0)].$$

To properly study the reversibility of the above coding machine, let us define 
two random variables, $X_\Omega$ and $X_S$. $X_\Omega$ takes values on the 
set $\Omega$, following the probability measure $p$, where $p(m_k)$ is the 
probability of having symbol $m_k$ as the input in a given computation. 
Essentially, $\Omega$ describes the behavior of a fluctuating environment. 
$X_S$ takes values on $S$ and follows the probability distribution $q$, which 
for a given $s_i \in S$ takes the following value: 

$$q(s_i) = \sum_{k \le n} p(m_k)\,\delta_{ki},$$

i.e., the probability of obtaining symbol $s_i$ as the output of a computation. 
The amount of uncertainty in recovering the inputs from the knowledge of 
the outputs of the computations performed by $T$ is related to the logical 
irreversibility. In fact, this amount of uncertainty is precisely the amount 
of extra information we need to introduce to have a non-ambiguous code. 
This amount of conditional uncertainty or extra information needed is well 
defined by the uncertainty function or Shannon's conditional entropy: 

$$H(X_\Omega|X_S) = -\sum_{k \le m} q(s_k) \sum_{i \le n} P(m_i|s_k) \log P(m_i|s_k), \tag{1}$$

where, by virtue of Bayes' theorem, 

$$P(m_i|s_k) = p(m_i)\,\delta_{ik} \left( \sum_{j \le n} p(m_j)\,\delta_{jk} \right)^{-1} = \frac{p(m_i)\,\delta_{ik}}{q(s_k)}.$$

Equation (1) is the amount of noise, i.e., the information that is dissipated 
during the communicative exchange or, conversely, the (minimum) amount 
of information we need to externally provide to the system in order to 
perfectly reconstruct the input. 
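The noise term can be computed directly for small codes. The sketch below is illustrative only: the function name `conditional_entropy`, the dictionary representation of $p$ and of the code, and the uniform four-referent environment are our own assumptions. It evaluates the double sum of Equation (1) in bits (base-2 logarithm), with the posterior obtained from Bayes' theorem.

```python
import math
from collections import defaultdict


def conditional_entropy(p, code):
    """H(X_Omega | X_S) in bits for a deterministic code.

    p    : dict mapping each referent to its probability p(m)
    code : dict mapping each referent to the sign that encodes it
    """
    # q(s) = sum of p(m) over the referents encoded by s
    q = defaultdict(float)
    for m, p_m in p.items():
        q[code[m]] += p_m

    h = 0.0
    for s, q_s in q.items():
        if q_s == 0:
            continue
        for m, p_m in p.items():
            if code[m] == s and p_m > 0:
                posterior = p_m / q_s  # P(m | s) by Bayes' theorem
                h -= q_s * posterior * math.log2(posterior)
    return h


# Hypothetical environment: four equiprobable referents.
p = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}

one_to_one = {"m1": "s1", "m2": "s2", "m3": "s3", "m4": "s4"}
two_to_one = {"m1": "s1", "m2": "s1", "m3": "s2", "m4": "s2"}
all_to_one = {"m1": "s1", "m2": "s1", "m3": "s1", "m4": "s1"}

noise = {name: conditional_entropy(p, c)
         for name, c in [("one_to_one", one_to_one),
                         ("two_to_one", two_to_one),
                         ("all_to_one", all_to_one)]}
```

With these assumed inputs the one-to-one code dissipates no information, the two-to-one code requires 1 extra bit, and the all-to-one code requires the full 2 bits of the input.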
3.3.2 Ambiguity and logical irreversibility 

The interpretation we provided for the noise equation enables us to 
rigorously connect ambiguity and logical (ir)reversibility. First, we 
emphasize a crucial fact: by the properties of Shannon's entropy, 

$$H(X_\Omega|X_S) \ge 0,$$

which explicitly states that information can be either destroyed or 
maintained but never created in the course of a given computation, as pointed 
out in Bennett and Landauer (1985). 

If there is no uncertainty in determining the input signals from knowledge 
of the outputs alone, then 

$$H(X_\Omega|X_S) = 0,$$

i.e., there is certainty when reversing the computations performed by the 
coding machine. Therefore, the computations performed by $T$ to define the 
code are logically reversible and the code is not ambiguous. Otherwise, if 

$$H(X_\Omega|X_S) > 0,$$

then we need extra information (at least $H(X_\Omega|X_S)$) to properly 
reverse the process, which indicates that the computations defining the code 
are logically irreversible and, thus, that the code is ambiguous. 

We have therefore identified in a quantitative and rigorous way the ambiguity 
of the code with the amount of uncertainty of the reversal of the coding 
process or the minimal amount of additional information we need to properly 
reverse the coding process. Furthermore, we have clearly identified the source 
of uncertainty through a rigorous concept, namely, logical irreversibility, 
which is a feature of the computations generating the code. In this way, we 
establish the following correspondences: 

logically reversible computations $\Leftrightarrow$ No Ambiguity $\Leftrightarrow$ $H(X_\Omega|X_S) = 0$ 

logically irreversible computations $\Leftrightarrow$ Ambiguity $\Leftrightarrow$ $H(X_\Omega|X_S) > 0$ 

Amount of Ambiguity $= H(X_\Omega|X_S)$. 

Now that we have rigorously defined ambiguity on the solid theoretical grounds 
of computation theory and information theory, we are ready to explain why it 
appears in natural communication systems. As we shall see in the following 
section, the reason is that natural systems must satisfy certain constraints 
that generate a communicative tension whose solution implies the emer- 
gence of a certain amount of ambiguity. 
4 The emergence of ambiguity in natural communication 

The tension we referred to at the end of the last section was postulated by 
the linguist G. K. Zipf (Zipf 1949) as the origin of the widespread scaling 
behavior of word appearance that bears his name. Such a communicative tension 
was conceived in terms of a balance between two opposite forces: "the 
speaker's economy force" and the "auditor's economy force". 



4.1 Zipf's hypothesis 

Let us thus informally present Zipf's vocabulary balance between two 
opposite forces, "the speaker's economy force" and the "auditor's economy 
force" (Zipf 1949, pp. 19-31). The speaker's economy force (also called 
Unification Force) is conceived as a tendency "to reduce the size of the 
vocabulary to a single word by unifying all meanings", whereas the auditor's 
economy force (or Diversification Force) "will tend to increase the size of 
a vocabulary to a point where there will be a distinctly different word for 
each different meaning". Therefore, a conflict will be present while trying 
to simultaneously satisfy these two theoretical opposite forces, and 
the resulting vocabulary will emerge from a cooperative solution to that 
conflict. In Zipf's words, 

"whenever a person uses words to convey meanings he will au- 
tomatically try to get his ideas across most efficiently by seeking 
a balance between the economy of a small wieldy vocabulary of 
more general reference on the one hand, and the economy of a 
larger one of more precise reference on the other, with the result 
that the vocabulary of n different words in his resulting flow of 
speech will represent a vocabulary balance between our theoretical 
Forces of Unification and Diversification" (Zipf 1949, p. 22). 



Obviously the Unification Force ensures a minimal amount of lexical 
ambiguity, since it will require some words to convey more than one meaning, 
and the Diversification Force constrains such amount. Thus, lexical 
ambiguity can be viewed as a consequence of the vocabulary balance. Although 
Zipf's vocabulary balance, as stated, provides a useful intuition to 
understand the emergence of lexical ambiguity by emphasizing the cooperative 
strategy between communicative agents, it lacks the necessary generality to 
provide a principled account for the origins of ambiguity beyond the 
particular case of lexical ambiguity. In the following sections we shall 
present several well-known concepts in order to generalize Zipf's informal 
condition and provide solid foundations for it. 

We remark that Zipf conceived the vocabulary balance as a particular case 
of a more general principle, the Least Effort Principle, "the primary 
principle that governs our entire individual and collective behaviour of all 
sorts, including the behaviour of our language and preconceptions" (Zipf 1949, 
p. 22). In Zipf's terms, 

"the Principle of Least Effort means, for example, that a per- 
son in solving his immediate problems will view these against 
the background of his probable future problems as estimated by 
himself. Moreover he will strive to solve his problems in such 
a way as to minimize the total work that must be expended 
in solving both his immediate problems and his probable fu- 
ture problems. That in turn means that the person will strive 
to minimize the probable average rate of his work-expenditure 
(over time). And in so doing he will be minimizing his effort, 
by our definition of effort. Least effort, therefore, is a variant of 
least work." 



Hence, we consider the symmetry equation between the complexities of the 
coder and the decoder we shall arrive at to be a particular instance of the 
Least Effort Principle. 



4.2 Symmetry in coding/decoding complexities 

How can we accommodate the previous intuitions to the rigorous framework 
proposed in section 3? The auditor's economy force leads to a one-to-one 
mapping between $\Omega$ and $S$. In this case, the computations performed by 
$T$ to generate the code are logically reversible and thus generate an 
unambiguous code, and no supplementary amount of information to successfully 
reconstruct $X_\Omega$ is required. However, the speaker's economy force 
conspires exactly in the opposite direction. In these latter terms, the best 
option is an all-to-one mapping, i.e., a coding process where any realization 
of $X_\Omega$ is coded through a single signal. Such a coding is logically 
irreversible and therefore totally ambiguous, for it is clear that the 
knowledge of the output tells us nothing about the input. In order to 
characterize this conflict, let us properly formalize the above intuitive 
statement: the auditor's force pushes the code in such a way that it is 
possible to reconstruct $X_\Omega$ through the intermediation of the coding 
performed by $T$. Therefore, the amount of bits the decoder of $X_S$ needs to 
unambiguously reconstruct $X_\Omega$ is 

$$H(X_\Omega, X_S) = -\sum_{i \le n} \sum_{k \le m} P(m_i, s_k) \log P(m_i, s_k),$$

which is the joint Shannon entropy or, simply, joint entropy of the two 
random variables $X_\Omega$, $X_S$ (Cover and Thomas 1991). From the 
codification process, the auditor receives $H(X_S)$ bits, and thus, the 
remaining uncertainty it must face will be 

$$H(X_\Omega, X_S) - H(X_S) = H(X_\Omega|X_S),$$

where 

$$H(X_S) = -\sum_{i \le m} q(s_i) \log q(s_i)$$

(i.e., the entropy of the random variable $X_S$) and 

$$H(X_\Omega|X_S) = -\sum_{i \le m} q(s_i) \sum_{k \le n} P(m_k|s_i) \log P(m_k|s_i),$$

the conditional entropy of the random variable $X_\Omega$ conditioned to the 
random variable $X_S$. At this point Zipf's hypothesis becomes crucial. Under 
this interpretation, the tension between the auditor's force and the speaker's 
force is cooperatively solved by imposing a symmetric balance between the 
efforts associated to each communicative agent: the coder sends as many 
bits as the additional bits the decoder needs to perfectly reconstruct 
$X_\Omega$: 

$$H(X_S) = H(X_\Omega|X_S). \tag{2}$$

This is the symmetry equation governing the communication among 
cooperative agents when we take into account computational efforts, which 
have been associated here with the entropy or complexity of the code. 
Selective pressures will push $H(X_S)$ upwards and, at the same time, by 
equation (2), the amount of ambiguity will also grow, as a consequence of the 
cooperative nature of communication. 
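The tension between the two forces, and the balance selected by the symmetry equation, can be explored by brute force on a small example. The following sketch is our own illustration and not the authors' procedure: it assumes a hypothetical environment of four equiprobable referents, enumerates every deterministic code over at most four signs, and keeps the codes for which $H(X_S) = H(X_\Omega|X_S)$. It relies on the fact that, for a deterministic code, $H(X_\Omega, X_S) = H(X_\Omega)$.

```python
import math
from collections import Counter
from itertools import product


def entropy(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(x * math.log2(x) for x in probs if x > 0)


def code_entropies(p, code):
    """Return (H(X_S), H(X_Omega | X_S)) for a deterministic code.

    p[i] is the probability of referent i and code[i] the sign assigned to it.
    Since the code is deterministic, H(X_Omega, X_S) = H(X_Omega), so the
    conditional entropy equals H(X_Omega) - H(X_S).
    """
    q = Counter()
    for p_m, sign in zip(p, code):
        q[sign] += p_m
    h_s = entropy(q.values())
    return h_s, entropy(p) - h_s


p = [0.25, 0.25, 0.25, 0.25]  # hypothetical uniform environment

one_to_one = code_entropies(p, (0, 1, 2, 3))  # auditor's ideal: no ambiguity
all_to_one = code_entropies(p, (0, 0, 0, 0))  # speaker's ideal: total ambiguity

# Codes satisfying the symmetry equation H(X_S) = H(X_Omega | X_S).
symmetric = [c for c in product(range(4), repeat=4)
             if math.isclose(*code_entropies(p, c), abs_tol=1e-12)]
```

Under these assumptions the only symmetric codes are those that split the four referents into two signs with two referents each, which is an intermediate point between the one-to-one and the all-to-one extremes.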

Equation (2) specifies that a certain amount of information must be lost 
(or equivalently, a certain amount of ambiguity must appear) if coder and 
decoder minimize their efforts in a symmetric scenario. A further question is 
how much information is lost due to equation (2). In order to measure this 
amount of information, we must take into consideration the properties of 
the so-called Shannon Information or Mutual Information between the two 
random variables $X_S$, $X_\Omega$, to be written as $I(X_S : X_\Omega)$. In 
our particular case, such a measure quantifies the amount of information about 
the input set obtained from knowledge of the output set alone after the 
computations. Consistently (Cover and Thomas 1991; Ash 1990), 

$$I(X_S : X_\Omega) = H(X_\Omega) - H(X_\Omega|X_S). \tag{3}$$

An interesting property of Shannon information is its symmetrical behavior, 
i.e., $I(X_S : X_\Omega) = I(X_\Omega : X_S)$. Thus, by equation (3), 

$$H(X_\Omega) - H(X_\Omega|X_S) = H(X_S) - H(X_S|X_\Omega),$$

where $H(X_S|X_\Omega) = 0$, because the Turing machine is deterministic. 
Therefore, by applying directly equation (2) to the above equation we reach 
the following identity: 

$$H(X_S) = \frac{1}{2} H(X_\Omega). \tag{4}$$

Thus, 

$$\begin{aligned}
I(X_S : X_\Omega) &= H(X_\Omega) - H(X_\Omega|X_S) && \text{by eq. (3)} \\
&= H(X_\Omega) - H(X_S) && \text{by eq. (2)} \\
&= \frac{1}{2} H(X_\Omega) && \text{by eq. (4).}
\end{aligned}$$



The above derivation shows that half of the information is dissipated 
during the communicative exchange if coding and decoding computations are 
symmetrically or cooperatively optimized.^{13} Accordingly, a certain amount 
of ambiguity must emerge. Ambiguity is not an inherent imperfection of a 
communication system or a footprint of poor design, but rather a property 
emerging from conditions on efficient computation: coding and decoding 
computations have a cost when they are performed by physical agents, and 
it therefore becomes crucial to minimize the costs of coding and decoding 
processes. Whereas studying the process of an isolated coding agent would 
not provide a necessity argument for the emergence of ambiguous codes 
(as noted in section 3.2, following Bennett (1973)), a formalization of an 
appropriately general version of Zipf's intuitions along the lines we 
developed provides a solid and general necessity argument for the emergence 
of ambiguity. 
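
The identity derived above can be checked numerically on a small example. 
The following Python sketch is only an illustration under stated 
assumptions: the four equiprobable meanings, the two signals and the 
function names are hypothetical and do not come from the paper. It builds 
a deterministic code in which the coder's cost $H(X_s)$ equals the 
decoder's residual uncertainty $H(X_\Omega|X_s)$, and then computes the 
mutual information. 

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy, in bits, of a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def analyse_code(p_input, code):
    """Entropies of a deterministic code mapping inputs to signals.

    Returns (H(X_Omega), H(X_s), H(X_Omega|X_s), I(X_s : X_Omega)).
    """
    # Probability of each signal: sum over the inputs mapped onto it.
    p_signal = defaultdict(float)
    for meaning, p in p_input.items():
        p_signal[code[meaning]] += p

    # Ambiguity: average uncertainty about the input once the signal is known.
    h_in_given_out = 0.0
    for signal, ps in p_signal.items():
        conditional = {m: p_input[m] / ps for m in p_input if code[m] == signal}
        h_in_given_out += ps * entropy(conditional)

    h_in = entropy(p_input)
    h_out = entropy(p_signal)
    mutual = h_in - h_in_given_out  # equation (3)
    return h_in, h_out, h_in_given_out, mutual


# Hypothetical toy code: four equiprobable meanings, two signals.
p_input = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}
code = {"m1": "a", "m2": "a", "m3": "b", "m4": "b"}

h_in, h_out, ambiguity, mutual = analyse_code(p_input, code)
print(h_in, h_out, ambiguity, mutual)
```

In this toy case the input entropy is 2 bits, the signal entropy and the 
ambiguity are both 1 bit, and the mutual information is 1 bit, that is, 
half of the input entropy. Codes that do not satisfy the symmetry 
condition will in general give a different ratio. 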

5 Discussion 



In this article we have argued that ambiguity appears at the externalization 
branch of language. We have substantiated this claim by reviewing 
certain familiar types of ambiguity: lexical ambiguity, syntactic ambigu- 
ity and quantifier scope ambiguity. We have constructed a communicative 
argument based on fundamental concepts from computation theory and in- 
formation theory in order to understand the emergence of ambiguity in the 
externalization branch, or in other words, why the generative procedure 
opts for reducing the resolution of phonetic forms with respect to logical 
forms in connecting the A-P system and the C-I system. 

We have rigorously identified the source of ambiguity in a code with the 
concept of logical irreversibility, in such a way that a code is ambiguous 
when the coding process performs logically irreversible computations. Given 
that logical irreversibility is not an essential property of computations and 
that a logically reversible machine need not be much more complicated than 
the logically irreversible machine it simulates, we have inquired into how 
a coding machine interacts with a decoding machine in an optimal way in 
order to identify the source of ambiguity. We have rigorously quantified the 
ambiguity of a code in terms of the amount of uncertainty in the reversal of 
the coding process, and we have subsequently formulated the intuition that 
coder and decoder cooperate in order to minimize their efforts in terms of 
a symmetry equation that forces the coder to send only as many bits as 
the additional bits the decoder needs to perfectly reconstruct the coding 
process. Given the symmetric behaviour of Shannon information, it has 
been possible to quantify the amount of ambiguity that must emerge from 
the symmetry equation regardless of the presence of noise in the channel: 
at least half of the information is dissipated during the communicative 
process if both the coding and the decoding computations are cooperatively 
minimized. As noted explicitly in the Appendix, the presence of ambiguity 
associated with a computational process realized by a physical system seems 
as necessary as the generation of heat during a thermodynamical process. 

The interest of the symmetry equation from which we derive a certain 
amount of ambiguity in natural languages is further corroborated in 
Corominas-Murtra et al. (2011). In this study it is shown that Zipf's law 
emerges from two factors: a static symmetry equation that solves the tension 
between coder and decoder (namely, our symmetry equation) and the 
path-dependence of the code evolution through time, which is mathematically 
stated by imposing a variational principle between successive states of the 
code (namely, Kullback's Minimum Discrimination of Information Principle). 
We thus conclude this study by emphasizing the importance of the 
symmetry equation for the understanding of how communicative efficiency 
considerations shape linguistic productions. 



Appendix: Ambiguity and physical irreversibility 



Throughout the paper we highlighted the strict relation between the logical 
irreversibility of the computations generating a given code and the 
ambiguity of the latter. Now we highlight the formal equivalence between the 
mathematical treatment we proposed to deal with ambiguity at the theoretical 
level and the mathematical formulation of physical/thermodynamical 
irreversibility. The strict relation between thermodynamic irreversibility 
and logical irreversibility has been a hot topic of debate since the 
definition of the equivalence of heat and bits by R. Landauer (Landauer 1961). 
This equivalence, known as Landauer's principle, states that, for any erased 
bit of information, a quantity of 

$$kT\ln 2$$

joules is dissipated as heat, where $k$ is the Boltzmann constant and 
$T$ the temperature of the system. 
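
To give a sense of the magnitudes involved, the following Python sketch 
evaluates the bound stated by Landauer's principle. It is an illustration 
only: the function name `landauer_heat` and the choice of 300 K as a 
representative room temperature are assumptions made here, not part of the 
original text. 

```python
import math

# Boltzmann constant in joules per kelvin (exact value in the SI).
BOLTZMANN_K = 1.380649e-23


def landauer_heat(bits_erased, temperature_kelvin):
    """Heat in joules dissipated by erasing `bits_erased` bits at temperature T."""
    return bits_erased * BOLTZMANN_K * temperature_kelvin * math.log(2)


# Assumed temperature: 300 K, roughly room temperature.
heat_one_bit = landauer_heat(1, 300.0)
print(f"{heat_one_bit:.3e} J per erased bit at 300 K")
```

At this assumed temperature the heat per erased bit is of the order of 
$3 \times 10^{-21}$ joules, and it scales linearly with the number of 
erased bits. 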

This principle relates logical irreversibility and thermodynamical 
irreversibility. Thermodynamical irreversibility is a property of abstract 
processes. Almost all processes taking place in our everyday life are 
irreversible. One of the consequences of this irreversibility is the 
degradation of energy. The common property of such processes is that they 
generate thermodynamical entropy. The second law of thermodynamics states 
that any physical process generates a non-negative amount of entropy, i.e., 
for the process $P$, 

$$\Delta S(P) \geq 0.$$

The units of physical entropy are nats instead of bits. Now suppose that we 
face the problem of reversing the process $P$ (for example, a gas expansion) 
for which $\Delta S(P) > 0$. Without further help, the reversal of this 
process is forbidden by the second law: it would generate a net amount of 
negative entropy. Therefore, we will need external energy to reverse the 
process. In the same way, we have seen that 

$$H(X_\Omega|X_s) \geq 0,$$

which means that information cannot be created during an information 
process. A negative value of $H(X_\Omega|X_s)$ would imply, by virtue of 
equation (3), a net creation of information. Therefore, we face the same 
problem. Indeed, if we have a computational process $C$ for which 
$H_C(X_\Omega|X_s) > 0$, the reversal of such a process, with no further 
external help, would be a process in which the computations generate 
information. The reversal, as we have discussed above, is only possible 
through the external addition of information. 

Thus the information flux can only be maintained (in the case where all 
computations are logically reversible) or degraded, and the same applies to 
the energy flux: by the second law, the energy flux can only be maintained 
(in the case of thermodynamically reversible processes) or degraded. We 
can go further. If $Q(P)$ is the heat generated during the physical process, 
physical entropy is defined as 

$$\Delta S(P) = \frac{Q(P)}{T}.$$

Furthermore, if we consider an ideal computational process $C$ we know, 
from Landauer's principle, that 

$$Q(C) = kT\ln 2 \times (\text{erased bits}).$$

And we actually know how many bits have been erased (or dissipated): 
exactly $H(X_\Omega|X_s)$ bits. Therefore, the physical entropy generated 
by this ideal, irreversible computing process will be: 

$$\Delta S(C) = k\ln 2 \, H_C(X_\Omega|X_s).$$

As a consequence, logically irreversible computations are thermodynamically 
irreversible. 
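
The intermediate step behind the last expression can be spelled out as 
follows. This is a sketch that assumes, as stated above, that the ideal 
process $C$ erases exactly $H_C(X_\Omega|X_s)$ bits and that the 
temperature $T$ stays constant during the computation: 

```latex
\begin{aligned}
\Delta S(C) &= \frac{Q(C)}{T} \\
            &= \frac{kT\ln 2 \; H_C(X_\Omega|X_s)}{T} \\
            &= k\ln 2 \; H_C(X_\Omega|X_s).
\end{aligned}
```

Since $k\ln 2 > 0$, the entropy generated is strictly positive exactly when 
$H_C(X_\Omega|X_s) > 0$, that is, when the computation is logically 
irreversible. 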

With this short exposition we emphasize the general character of logical ir- 
reversibility and ambiguity in natural communication systems. More than 
an imperfection, ambiguity seems to be, for natural communication sys- 
tems, a feature as unavoidable as the generation of heat during a thermo- 
dynamical process. 

Notes 



^{1} It is desirable, but not mandatory. As noted by Thomason in his 
'Introduction' (Montague 1974, Chapter 1), "[A] by-product of Montague's 
work (...) is a theory of how logical consequence can be defined for 
languages admitting syntactic ambiguity. For those logicians concerned only 
with artificial languages this generalization will be of little interest, 
since there is no serious point to constructing an artificial language that 
is not disambiguated (p. 4, note 5)" if the objective is to characterize 
logical notions such as consequence. However, it is important for the 
development of 'Universal Grammar' in Montague's sense, i.e., for the 
development of a general and uniform mathematical theory valid for the 
syntax and semantics of both artificial and natural languages. We refer the 
reader interested in the treatment of ambiguity in Montague Grammar to 
Thomason's 'Introduction' (Montague 1974, Chapter 1) and to Montague's 
'Universal Grammar' (Montague 1974, Chapter 7). 

^{2} For the sake of simplicity, we deliberately omit in this paper the 
mathematical distinction between ambiguous grammars and ambiguous languages. 
It is worth noting, however, that the set of well-formed strings of a given 
language can be generated, a priori, by more than one system of syntactic 
rules. Ambiguous grammars lead to so-called structural ambiguity, which 
refers, roughly speaking, to the possibility of attributing two or more 
syntactic structures to a given string. Since a given language can be 
generated by more than one grammar system, one can argue that the presence 
of structural ambiguity is contingent, because one could hypothetically find 
another grammar accounting for the studied language without structural 
ambiguity. This is not, however, the general case. Indeed, it can be shown 
that some languages are inherently ambiguous, i.e., that all grammars 
accounting for them are ambiguous. We refer the interested reader to the 
excellent discussion of this topic provided by Hopcroft and Ullman (1979, 
pp. 99-103). 

^{3} The following sentences are also brought into consideration (Prieto 1997): 

(18) La vella llança l'amenaça 
     a. The old lady threatens him/her 
     b. The old lance threatens him/her 

(19) La vella escolta la veu 
     a. The old lady listens to the voice 
     b. The old girl scout sees her 

(20) La poderosa crema la casa 
     a. The powerful lady sets the house on fire 
     b. The powerful cream helps her get married 

(21) El vell guarda la porta 
     a. The old man guards the door 
     b. The old guard is wearing it (the scarf) 

^{4} We refer the reader to Fortuny and Corominas-Murtra (2009) for a 
set-theoretical definition of internal and external merge as well as of the 
basic structural relationships. 

^{5} A well-known and interesting property of vague predicates is that they 
give rise to the Sorites paradox, which can be illustrated by the following 
reasoning: 

• $p_0$: A man who has no euro is poor. 

• $p_i \to p_{i+1}$: If a man who is poor earns one euro, he remains poor. 

• $p_{1000000}$: Therefore, a man who has one million euros is poor. 

This reasoning is composed of the atomic proposition $p_0$ and the 
implication $p_i \to p_{i+1}$, which states that for all situations $p_i$ in 
which a man is poor, he remains poor in $p_{i+1}$ after earning one euro. 
The reasoning is thus the application of modus ponens iterated 1000000 
times. Although the argument is correct and the two premises are true, the 
conclusion $p_{1000000}$ is admittedly false. 

^{6} We refer to Noguera i Clofent (2008) for a very interesting presentation 
of vagueness and fuzzy logic. 

^{7} Both the computation-theoretic and information-theoretic concepts used 
in the following section are standard and can be found in any basic 
reference work on these subjects. We refer the reader interested in 
computation-theoretic concepts to the classical references Hopcroft and 
Ullman (1979); Davis (1985); Lewis and Papadimitriou (1997); Papadimitriou 
(2003); and to Ash (1990) and Cover and Thomas (1991) for an introduction 
to information theory. 

^{8} The problem of the reversibility/irreversibility of a computational 
process was first proposed in relation to the problem of heat generation 
during such a process. In this way, it was postulated that any irreversible 
computation generates an amount of heat (the so-called Landauer's 
principle). We refer the interested reader to Bennett and Landauer (1985). 
See also Appendix A. 

^{9} Throughout the paper, $\log \equiv \log_2$. 

^{10} In the context of this section, complexity has to be understood in the 
sense of Kolmogorov complexity. Given an abstract object, such a general 
complexity measure is the length, in bits, of the minimal program whose 
execution in a Universal Turing machine generates a complete description of 
the object. In the case of codes where the presence of a given signal is 
governed by a probabilistic process, it can be shown that Kolmogorov 
complexity equals (up to an additive constant factor) the entropy of the 
code (Cover and Thomas 1991). 

^{11} Equations of this kind have been obtained in the past through different 
approaches: Harremoës and Topsøe (2001); Ferrer-i-Cancho and Solé (2003). 

^{12} Notice that, if the Turing machine is deterministic, every input 
generates one and only one output. The problem may arise during the 
reversal process, if the computations are logically irreversible. 

^{13} This derivation has been performed by assuming that there is no noise 
affecting the process of output set observation. If we assume the more 
realistic situation in which there is noise in the process of output 
observation, the situation is even worse, and, actually, 
$I(X_s : X_\Omega) = \frac{1}{2} H(X_\Omega)$ would be considered as an 
upper bound; therefore, in the presence of noise in the process of output 
observation, this equation must be replaced by: 

$$I(X_s : X_\Omega) \leq \frac{1}{2} H(X_\Omega).$$
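
The effect of observation noise can be illustrated with a small Python 
sketch. Everything in it is an assumption introduced for illustration: the 
toy code (four equiprobable meanings mapped onto two signals), the noise 
model (each observed signal is swapped with probability `flip_prob`, i.e., 
a binary symmetric channel) and the function names. 

```python
import math


def binary_entropy(eps):
    """Entropy in bits of a biased coin with probability eps."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)


def mutual_information(joint):
    """Mutual information in bits from a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


def noisy_joint(p_input, code, flip_prob):
    """Joint distribution of (input, observed signal) for a two-signal code."""
    signals = sorted(set(code.values()))
    if len(signals) != 2:
        raise ValueError("this sketch assumes exactly two signals")
    joint = {}
    for meaning, p in p_input.items():
        sent = code[meaning]
        other = signals[1] if sent == signals[0] else signals[0]
        joint[(meaning, sent)] = joint.get((meaning, sent), 0.0) + p * (1 - flip_prob)
        joint[(meaning, other)] = joint.get((meaning, other), 0.0) + p * flip_prob
    return joint


# Hypothetical toy code with H(X_Omega) = 2 bits.
p_input = {"m1": 0.25, "m2": 0.25, "m3": 0.25, "m4": 0.25}
code = {"m1": "a", "m2": "a", "m3": "b", "m4": "b"}

clean = mutual_information(noisy_joint(p_input, code, 0.0))
noisy = mutual_information(noisy_joint(p_input, code, 0.1))
print(clean, noisy)
```

Without noise the mutual information equals 1 bit, half of the input 
entropy. With a 10% swap probability it drops to about 0.53 bits, which is 
consistent with the bound above for this particular, assumed noise model. 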